regulatory document
Apple accidentally LEAKS the name of its new budget MacBook set to be released today
Kentucky mother and daughter turn down $26.5MILLION to sell their farms to secretive tech giant that wants to build data center there Horrifying next twist in the Alexander brothers case: MAUREEN CALLAHAN exposes an unthinkable perversion that's been hiding in plain sight Hollywood icon who starred in Psycho after Hitchcock dubbed her'my new Grace Kelly' looks incredible at 95 Kylie Jenner's total humiliation in Hollywood: Derogatory rumor leaves her boyfriend's peers'laughing at her' behind her back Tucker Carlson erupts at Trump adviser as she hurls'SLANDER' claim linking him to synagogue shooting Ben Affleck'scores $600m deal' with Netflix to sell his AI film start-up Long hair over 45 is ageing and try-hard. I've finally cut mine off. Alexander brothers' alleged HIGH SCHOOL rape video: Classmates speak out on sickening footage... as creepy unseen photos are exposed Heartbreaking video shows very elderly DoorDash driver shuffle down customer's driveway with coffee order because he is too poor to retire Amber Valletta, 52, was a '90s Vogue model who made movies with Sandra Bullock and Kate Hudson, see her now Model Cindy Crawford, 60, mocked for her'out of touch' morning routine: 'Nothing about this is normal' READ MORE: 'This is insanity for $600': Apple fans BLAST the new iPhone 17e Apple appears to have accidentally leaked the name of its new budget MacBook, ahead of its grand reveal today. The low-cost device is expected to be the final gadget in a flurry of launches this week, following the iPhone 17e, new iPad Air, MacBook Pro and MacBook Air. Eagle-eyed fans have spotted a regulatory document on Apple's website, listing a'MacBook Neo' under the 2026 release section.
LegalRAG: A Hybrid RAG System for Multilingual Legal Information Retrieval
Kabir, Muhammad Rafsan, Sultan, Rafeed Mohammad, Rahman, Fuad, Amin, Mohammad Ruhul, Momen, Sifat, Mohammed, Nabeel, Rahman, Shafin
Natural Language Processing (NLP) and computational linguistic techniques are increasingly being applied across various domains, yet their use in legal and regulatory tasks remains limited. To address this gap, we develop an efficient bilingual question-answering framework for regulatory documents, specifically the Bangladesh Police Gazettes, which contain both English and Bangla text. Our approach employs modern Retrieval Augmented Generation (RAG) pipelines to enhance information retrieval and response generation. In addition to conventional RAG pipelines, we propose an advanced RAG-based approach that improves retrieval performance, leading to more precise answers. This system enables efficient searching for specific government legal notices, making legal information more accessible. We evaluate both our proposed and conventional RAG systems on a diverse test set on Bangladesh Police Gazettes, demonstrating that our approach consistently outperforms existing methods across all evaluation metrics.
Evaluating Retrieval Augmented Generative Models for Document Queries in Transportation Safety
Melton, Chad, Sorokine, Alex, Peterson, Steve
Evaluating Retrieval A ugmented G enerative Models for Document Queries in Transportation Safety C.A. Melton, A. Sorokine, S. Peterson Oak Ridge National Laboratory, Oak Ridge, TN, United States National Security Sciences Directorate ABSTRACT Applications of generative Large Language Models (LLMs) are rapidly expanding across various domains, promising significant improvements in workflow efficiency and information retrieval. However, their implementation in specialized, high - stakes domains suc h as hazardous materials transportation is challenging due to accuracy and reliability concerns. This study evaluates the performance of three fine - tuned generative models -- ChatGPT, Google's Vertex AI, and ORNL Retrieval - Augmented Generation augmented LLaMA 2 and LLaMA in retrieving regulatory information essential for hazardous material transportation compliance in the United States. Utilizing approximately 40 publicly available federal and state regulatory documents, we developed 100 realistic queries relevant to route planning and permitting requirements. Responses were qualitatively rated based on accuracy, detail, and relevance, complemented by quantitative assessments of semantic similarity between model outputs. Results demon strated that the RAG - augmented LLaMA models significantly outperformed Vertex AI and ChatGPT, providing more detailed and generally accurate information, despite occasional inconsistencies. This research introduces the first known application of RAG in tra nsportation safety, emphasizing the need for domain - specific fine - tuning and rigorous evaluation methodologies to ensure reliability and minimize the risk of inaccuracies in high - stakes environments.
NLP-based Regulatory Compliance -- Using GPT 4.0 to Decode Regulatory Documents
Kumar, Bimal, Roussinov, Dmitri
Large Language Models (LLMs) such as GPT-4.0 have shown significant promise in addressing the semantic complexities of regulatory documents, particularly in detecting inconsistencies and contradictions. This study evaluates GPT-4.0's ability to identify conflicts within regulatory requirements by analyzing a curated corpus with artificially injected ambiguities and contradictions, designed in collaboration with architects and compliance engineers. Using metrics such as precision, recall, and F1 score, the experiment demonstrates GPT-4.0's effectiveness in detecting inconsistencies, with findings validated by human experts. The results highlight the potential of LLMs to enhance regulatory compliance processes, though further testing with larger datasets and domain-specific fine-tuning is needed to maximize accuracy and practical applicability. Future work will explore automated conflict resolution and real-world implementation through pilot projects with industry partners.
RegNLP in Action: Facilitating Compliance Through Automated Information Retrieval and Answer Generation
Gokhan, Tuba, Wang, Kexin, Gurevych, Iryna, Briscoe, Ted
Regulatory documents, issued by governmental regulatory bodies, establish rules, guidelines, and standards that organizations must adhere to for legal compliance. These documents, characterized by their length, complexity and frequent updates, are challenging to interpret, requiring significant allocation of time and expertise on the part of organizations to ensure ongoing compliance.Regulatory Natural Language Processing (RegNLP) is a multidisciplinary subfield aimed at simplifying access to and interpretation of regulatory rules and obligations. We define an Automated Question-Passage Generation task for RegNLP, create the ObliQA dataset containing 27,869 questions derived from the Abu Dhabi Global Markets (ADGM) financial regulation document collection, design a baseline Regulatory Information Retrieval and Answer Generation system, and evaluate it with RePASs, a novel evaluation metric that tests whether generated answers accurately capture all relevant obligations and avoid contradictions.
Summarizing long regulatory documents with a multi-step pipeline
Sie, Mika, Beek, Ruby, Bots, Michiel, Brinkkemper, Sjaak, Gatt, Albert
Due to their length and complexity, long regulatory texts are challenging to summarize. To address this, a multi-step extractive-abstractive architecture is proposed to handle lengthy regulatory documents more effectively. In this paper, we show that the effectiveness of a two-step architecture for summarizing long regulatory texts varies significantly depending on the model used. Specifically, the two-step architecture improves the performance of decoder-only models. For abstractive encoder-decoder models with short context lengths, the effectiveness of an extractive step varies, whereas for long-context encoder-decoder models, the extractive step worsens their performance. This research also highlights the challenges of evaluating generated texts, as evidenced by the differing results from human and automated evaluations. Most notably, human evaluations favoured language models pretrained on legal text, while automated metrics rank general-purpose language models higher. The results underscore the importance of selecting the appropriate summarization strategy based on model architecture and context length.
Identification of Regulatory Requirements Relevant to Business Processes: A Comparative Study on Generative AI, Embedding-based Ranking, Crowd and Expert-driven Methods
Sai, Catherine, Sadiq, Shazia, Han, Lei, Demartini, Gianluca, Rinderle-Ma, Stefanie
Organizations face the challenge of ensuring compliance with an increasing amount of requirements from various regulatory documents. Which requirements are relevant depends on aspects such as the geographic location of the organization, its domain, size, and business processes. Considering these contextual factors, as a first step, relevant documents (e.g., laws, regulations, directives, policies) are identified, followed by a more detailed analysis of which parts of the identified documents are relevant for which step of a given business process. Nowadays the identification of regulatory requirements relevant to business processes is mostly done manually by domain and legal experts, posing a tremendous effort on them, especially for a large number of regulatory documents which might frequently change. Hence, this work examines how legal and domain experts can be assisted in the assessment of relevant requirements. For this, we compare an embedding-based NLP ranking method, a generative AI method using GPT-4, and a crowdsourced method with the purely manual method of creating relevancy labels by experts. The proposed methods are evaluated based on two case studies: an Australian insurance case created with domain experts and a global banking use case, adapted from SAP Signavio's workflow example of an international guideline. A gold standard is created for both BPMN2.0 processes and matched to real-world textual requirements from multiple regulatory documents. The evaluation and discussion provide insights into strengths and weaknesses of each method regarding applicability, automation, transparency, and reproducibility and provide guidelines on which method combinations will maximize benefits for given characteristics such as process usage, impact, and dynamics of an application scenario.